OpenStack Object Storage 


Software to reliably store billions of objects
distributed across standard hardware 

Training Goals

By the end of this course you should:

› Be able to list the components in Swift
› Have an understanding of the Swift architecture
› Be able to interact with an OpenStack Object Storage deployment
› Know where to go to find more information


Deploying OpenStack: What does it take? 


Operational Expertise

Proven, Scalable Open Source Operating System

Tested, cloud-optimized hardware

Object Storage Summary

| FULLY DISTRIBUTED |
| COMMODITY HARDWARE |
| FEATURES OPTIMIZED FOR SCALE |
| DATA PROTECTION IN SOFTWARE |
| NOT A FILESYSTEM |
| AUGMENTS SAN/NAS/DAS, DOESN'T REPLACE |

Why

› Explosion in unstructured data
› High operational costs

[Chart: Global Storage Growth. Source: IBM and IDC forecasts]

Zettabyte


1,000 Exabytes 
1,000,000 Petabytes 
All of the data on Earth today 
(150GB of data per person) 

Zettabyte


2% OF THE DATA ON EARTH IN 2020 

OpenStack Object Storage Key Features

› REST-based API
› Data distributed evenly throughout system
› Scalable to multiple petabytes, billions of objects
› Account/Container/Object structure (not file system, no nesting) plus Replication (N copies of accounts, containers, objects)
› No central database
› Hardware agnostic: standard hardware, RAID not required

Swift vs. RAID

| SWIFT | RAID |
| Massively scalable multiple container storage | Limited to # of disks in a physical form factor |
| Easily add capacity, only moving rebalanced data | May not be possible to resize |
| 66%+ loss of capacity | 0-50% loss of data capacity |
| 3x+ data redundancy | 0-2x maximum data redundancy |
| Designed for remote large/long term file storage | Designed for performance/direct access |
| Uses commodity hardware | Typically requires high end hardware |

Tuesday, October 23, 2012



Evolution of Object Storage Architecture 


Version 1: Central DB (Rackspace Cloud Files 2008)
Version 2: Fully Distributed (OpenStack Object Storage 2010)

[Diagram labels: Proxy Servers, Storage Servers, Container Servers, Job Servers]

Swift Components

[Diagram labels: Proxy Server, Container Servers, Object Servers]

Swift Components

[Diagram labels: Account, Container, Container, Container]

Swift Components: The Ring

PUT /v1.0/<account_id>/<container>/<object>
(Upload object to container)

ecb25d1facd7c6760f7663e394dbeddb

Location

Swift Components: The Ring

Object

ddc5f5e86d2185e1b1ff763aff13ce6a

Location


Swift Components (Object Ring)

PUT /v1.0/<account_id>/<container>/<object>
(Upload object to container)

ecb25d1facd7c6760f7663e394dbeddb

Partition 293823

Object Ring
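
The path-to-hash-to-partition flow on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Swift's own code: the partition power of 20, the account name, and the object name are hypothetical, and a real cluster also mixes a per-cluster secret into the hash, which is omitted here.

```python
import hashlib
import struct

PART_POWER = 20  # hypothetical value; 2**20 = 1,048,576 partitions


def partition_for(account, container, obj, part_power=PART_POWER):
    """Map an object path to a ring partition number (illustrative only)."""
    path = f"/{account}/{container}/{obj}".encode("utf-8")
    digest = hashlib.md5(path).digest()
    # Read the first 4 bytes of the MD5 digest as a big-endian unsigned
    # integer, then keep only its top `part_power` bits.
    top32 = struct.unpack_from(">I", digest)[0]
    return top32 >> (32 - part_power)


# Hypothetical names; the same path always lands on the same partition.
part = partition_for("AUTH_test", "photos", "cat.jpg")
print(part)
```

Because the partition depends only on the name, any proxy or storage node holding the same ring can locate an object without consulting a central database.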

Swift Zones

Swift Object Replicas

Zone 1 Zone 2 Zone 3
Object Server Object Server Object Server

z1-10.0.0.1:6000/sda 1000
z1-10.0.0.1:6000/sdb 1000
z1-10.0.0.2:6000/sda 2000
z1-10.0.0.3:6000/sda 2000
z1-10.0.0.3:6000/sdb 1000
z1-10.0.0.3:6000/sdc 1000

z2-10.0.0.4:6000/sda 4000
z2-10.0.0.5:6000/sda 1000
z2-10.0.0.5:6000/sdb 1000
z2-10.0.0.5:6000/sdc 1000
z2-10.0.0.5:6000/sdd 1000

z3-10.0.0.6:6000/sda 1000
z3-10.0.0.6:6000/sdb 1000
z3-10.0.0.7:6000/sda 1000
z3-10.0.0.7:6000/sdb 1000
z3-10.0.0.8:6000/sda 1000
z3-10.0.0.8:6000/sdb 1000
z3-10.0.0.9:6000/sda 1000
z3-10.0.0.9:6000/sdb 1000
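
As a rough check on this device list, the sketch below totals the weights per zone and estimates each device's share of partition assignments. It assumes each row reads as zone-ip:port/device followed by a weight, and that assignments are spread in proportion to weight. The partition count and replica count are hypothetical, and the real ring builder also applies zone-separation rules that this ignores.

```python
from collections import defaultdict

# (zone, device, weight) rows transcribed from the list above.
DEVICES = [
    (1, "10.0.0.1:6000/sda", 1000), (1, "10.0.0.1:6000/sdb", 1000),
    (1, "10.0.0.2:6000/sda", 2000), (1, "10.0.0.3:6000/sda", 2000),
    (1, "10.0.0.3:6000/sdb", 1000), (1, "10.0.0.3:6000/sdc", 1000),
    (2, "10.0.0.4:6000/sda", 4000), (2, "10.0.0.5:6000/sda", 1000),
    (2, "10.0.0.5:6000/sdb", 1000), (2, "10.0.0.5:6000/sdc", 1000),
    (2, "10.0.0.5:6000/sdd", 1000),
    (3, "10.0.0.6:6000/sda", 1000), (3, "10.0.0.6:6000/sdb", 1000),
    (3, "10.0.0.7:6000/sda", 1000), (3, "10.0.0.7:6000/sdb", 1000),
    (3, "10.0.0.8:6000/sda", 1000), (3, "10.0.0.8:6000/sdb", 1000),
    (3, "10.0.0.9:6000/sda", 1000), (3, "10.0.0.9:6000/sdb", 1000),
]


def zone_weights(devices):
    """Sum the device weights in each zone."""
    totals = defaultdict(int)
    for zone, _device, weight in devices:
        totals[zone] += weight
    return dict(totals)


def partition_share(devices, total_assignments):
    """Split partition assignments across devices in proportion to weight."""
    total_weight = sum(weight for _zone, _device, weight in devices)
    return {
        device: total_assignments * weight / total_weight
        for _zone, device, weight in devices
    }


# Hypothetical ring: 2**10 partitions, 3 replicas each.
shares = partition_share(DEVICES, total_assignments=(2 ** 10) * 3)
print(zone_weights(DEVICES))
print(shares["10.0.0.4:6000/sda"], shares["10.0.0.1:6000/sda"])
```

Although the zones hold different numbers of drives, their weights add up to the same total, so each zone can carry one full replica.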


Swift - Writing Objects to the Cluster 

Swift - Reading Objects from the Cluster

Zone 1 Zone 3
Object Server Object Server Object Server

Swift - Replication

Zone 1 Zone 2 Zone 3
Object Server Object Server Object Server

Swift - Handoff Nodes

[Diagram labels: Zone 1, Zone 2, Zone 3, Zone 4, Zone 5, Object Server, Storage Node Unavailable]

Swift - Deleting Objects from the Cluster

How many rings does Swift have?

Object Ring: PUT /v1.0/<account_id>/<container>/<object>

Container Ring: GET /v1.0/<account_id>/<container>

Account Ring: GET /v1.0/<account_id>

Swift Components (Container Ring)

GET /v1.0/<account_id>/<container>
(Get objects in container)

415b952f70ceff5ee85cfcae165ed329

Partition #3764
z2-10.1.0.13:6001/sdg1
z4-10.1.0.6:6001/sda1
z1-10.1.0.12:6001/sdc1

Container Ring




Swift Components (Account Ring)

GET /v1.0/<account_id>
(Get containers in account)

89c5270c0e27c648cd2a27e0034f3b85

Partition #341
z3-10.1.0.26:6002/sdc1
z6-10.1.0.18:6002/sdj1
z5-10.1.0.32:6002/sdm1

Account Ring

Updating Account & Container Databases

System Components

› The Ring: Mapping of names to entities (accounts, containers, objects) on disk.
    › Stores data based on zones, devices, partitions, and replicas
    › Weights can be used to balance the distribution of partitions
    › Used by both the proxy server and storage nodes for many background processes
› Proxy Server: Request routing, exposes the public API
› Object Server: Blob storage server, uses xattrs, uses binary format
    › Recommended to run on XFS
    › Object location based on path from name hash & timestamp
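
To make "path from name hash & timestamp" concrete, here is a small sketch of how such an on-disk path could be assembled. The directory layout (objects/partition/hash suffix/hash/timestamp.data), the mount point, and the timestamp format are assumptions made for illustration and should be checked against a real deployment. The helper name `object_data_path` is hypothetical.

```python
import hashlib
import posixpath


def object_data_path(device_root, partition, account, container, obj, timestamp):
    """Build an illustrative on-disk path for one object version."""
    name = f"/{account}/{container}/{obj}".encode("utf-8")
    name_hash = hashlib.md5(name).hexdigest()
    # Assumed layout: group hash directories by the last three hex digits.
    suffix = name_hash[-3:]
    # Assumed file name: zero-padded timestamp with five decimal places.
    filename = "%016.5f.data" % timestamp
    return posixpath.join(
        device_root, "objects", str(partition), suffix, name_hash, filename
    )


# Hypothetical mount point, partition, names, and upload time.
path = object_data_path(
    "/srv/node/sda", 293823, "AUTH_test", "photos", "cat.jpg", 1350950400.0
)
print(path)
```

Putting the timestamp in the file name means a newer upload of the same object sorts after the older one inside the same hash directory.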

System Components (Cont.)

› Container Server: Handles listing of objects, stores as SQLite DB
› Account Server: Handles listing of containers, stores as SQLite DB
› Replication: Keep the system consistent, handle failures
› Updaters: Process failed or queued updates
› Auditors: Verify integrity of objects, containers, and accounts

Replication

› Account and Container replication
    › Hash comparison of SQLite databases per node
    › Update only from row X based on tuple of known records
    › If DB is missing, entire DB is pushed
› Object replication
    › Hash comparison of directories and files
    › Rsync worker for changed folders only
    › Push based approach
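
The object replication idea (hash each directory, compare, push only what differs) can be sketched as follows. This is a simplified model rather than Swift's replicator: directory listings are plain dictionaries, the function names are hypothetical, and the actual transfer (rsync in the slide) is left out.

```python
import hashlib


def directory_hashes(listing):
    """Hash the sorted file names of each directory.

    `listing` maps a directory name to the file names it contains.
    """
    hashes = {}
    for directory, names in listing.items():
        md5 = hashlib.md5()
        for name in sorted(names):
            md5.update(name.encode("utf-8"))
        hashes[directory] = md5.hexdigest()
    return hashes


def directories_to_push(local_hashes, remote_hashes):
    """Directories whose hash differs from, or is missing on, the remote."""
    return sorted(
        directory
        for directory, digest in local_hashes.items()
        if remote_hashes.get(directory) != digest
    )


# Hypothetical listings: the remote copy of "f01" holds an older file.
local = {"a83": ["1350950400.00000.data"], "f01": ["1350950500.00000.data"]}
remote = {"a83": ["1350950400.00000.data"], "f01": ["1350950000.00000.data"]}

changed = directories_to_push(directory_hashes(local), directory_hashes(remote))
print(changed)  # ['f01']
```

Only the directories named in `changed` would need to be synced, which keeps replication traffic proportional to what actually changed.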

Basic user management

› Swift accounts/users can be managed with built-in utilities starting with "swauth-"
› Adding an admin/user:
    swauth-add-user -A http://localhost:8080/auth/ -K superpass -a account username password
› Verify users, accounts, passwords:
    swauth-list -A http://localhost:8080/auth/ -K superpass [account]
› Delete a user:
    swauth-delete-user -A http://localhost:8080/auth/ -K superpass account username

Exercise: Basic user management

› Create an account called "testaccount" with users:
    › test1 (administrative account)
    › test2 (non-administrative account)
› Verify the users exist using swauth-list

Swift Tool CLI ("swift")

› The swift tool is part of the swift packages
› Show storage stats for a user:
    swift -A http://url:8080/auth/v1.0/ -U account:user -K password stat
› Upload a file:
    swift -A http://url:8080/auth/v1.0/ -U account:user -K password upload container yourfile.txt
› Download your file:
    swift -A http://url:8080/auth/v1.0/ -U account:user -K password download container yourfile.txt -o outputfile.txt

Exercise: Swift Tool CLI ("swift")

› Create a test container called "testcontainer" using the test1 administrative account previously created
› Upload a file to the test container
› Verify it can be downloaded with the test1 user
› Can the file be downloaded with the test2 user?

Swift ACLs

› Read and Write ACLs, set with "swift post -r/-w"
› Based on referrer, account, or user
› Referrer
    › .r:* (all referrers)
    › .r:.somewhere.com (only from *.somewhere.com)
    › .r:-.microsoft.com (not from *.microsoft.com)
› Accounts/Users
    › testaccount (any user in the testaccount account)
    › testaccount:test1 (only the test1 user)

Swift ACLs

› ACLs can be combined
    › .r:*,.r:-specifichost.specificdomain.com
    › testaccount:test1,testaccount:test2
› ACLs evaluated left to right, last ACL wins
    › .r:-specifichost.specificdomain.com,.r:*
        › Bad: still allows specifichost
    › .r:*,.r:-specifichost.specificdomain.com
        › Good: allows anyone except specifichost
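
The "left to right, last ACL wins" rule is easy to get backwards, so a small model of referrer evaluation may help. This sketch is written for illustration and is not Swift's implementation: it handles only `.r:` entries, skips account/user entries, and the function name and test URLs are made up.

```python
from urllib.parse import urlparse


def referrer_allowed(referrer, acl):
    """Evaluate the .r: entries of an ACL string from left to right."""
    host = urlparse(referrer or "").hostname or "unknown"
    allowed = False
    for entry in (item.strip() for item in acl.split(",")):
        if not entry.startswith(".r:"):
            continue  # account/user entries are outside this sketch
        pattern = entry[3:]
        deny = pattern.startswith("-")
        if deny:
            pattern = pattern[1:]
        matches = (
            pattern == "*"
            or pattern == host
            or (pattern.startswith(".") and host.endswith(pattern))
        )
        if matches:
            # A later matching entry overrides any earlier decision.
            allowed = not deny
    return allowed


GOOD = ".r:*,.r:-specifichost.specificdomain.com"
BAD = ".r:-specifichost.specificdomain.com,.r:*"
BLOCKED = "http://specifichost.specificdomain.com/page"

print(referrer_allowed(BLOCKED, GOOD))  # False
print(referrer_allowed(BLOCKED, BAD))   # True
```

In the "bad" ordering the trailing `.r:*` matches every host and overrides the earlier denial, which is why the deny entry has to come last.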

Exercise: ACLs

› Can the test2 user download the test file?
    › Why not?
› Set read ACLs on the container to allow test2 to read the file
    › Can the test2 user read the test file?
    › Can it write a new file to the container?
› Set write ACLs on the container to allow test2 to upload a file

Operations

› If a single drive fails and is not expected to be replaced quickly, unmount the drive and remove it from the ring using swift-ring-builder so Swift can work around the failure.
› Once the drive is replaced, mount it properly and add it back to the ring.
› The replication services will automatically repopulate the data on the drive.
› swift-drive-audit can be run from cron to audit kern.log and unmount any drives that appear to be reaching a failure threshold.

Exercise: Observing handoff operation

› Observe storage locations with swift-get-nodes
› Unmount drives and watch data move
› Remove drive from rings and push ring data
› Observe data motion

Exercise: Drive Auditing

› Install swift-drive-audit script
› Set up drive auditor to run out of cron

Operations

› If a storage node fails, determine how long the node will be out of service
    › Long period of time: Remove the node from the ring using swift-ring-builder so Swift can work around the failure.
    › Short period of time: Swap chassis/replace node and let replication bring the device back into sync

Swift monitoring

› Lots of metrics!
› Host/Network (Traditional monitoring)
    › Cab uplinks
    › Proxy interfaces
    › LB interfaces
› Log trawling
    › Bytes in, out, GETs, PUTs, POSTs, etc
    › Proxy response codes
    › Replication times

Swift monitoring (etc)

› Swift-specific monitoring
    › Storage capacity (swift-stats)
    › Async Pending (manual script)
    › Dispersion

Exercise: Dispersion Reports

› Set up dispersion reports

Additional Resources

› Swift administration guide:
    › http://swift.openstack.org/admin_guide.html
› The Ring - explained in 5 parts:
    › http://tlohg.wordpress.com/2011/02/07/building-a-consistent-hashing-ring-part-1/

Appendix

Partition Power

Determining your Ring Size

| Drives at Max Cluster Size | Number of Ring Partitions Per Drive | Target Number of Partitions in the Ring |

Partition Power setting in Ring Builder

| Closest Partition Power | Total Number of Partitions in the Ring |
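
The column headings above suggest a simple calculation, sketched here under stated assumptions: the target partition count is taken to be drives multiplied by partitions per drive, and the "closest" power is taken to be the next power of two at or above that target. The default of 100 partitions per drive and the 5,000-drive example are illustrative figures, not values from the slide.

```python
def partition_power(max_drives, partitions_per_drive=100):
    """Return (power, total_partitions) for a planned maximum cluster size."""
    target = max_drives * partitions_per_drive
    # Smallest power such that 2**power >= target, using integer arithmetic.
    power = (target - 1).bit_length()
    return power, 2 ** power


# Hypothetical cluster that may grow to 5,000 drives.
power, total = partition_power(5000)
print(power, total)  # 19 524288
```

The resulting power is the value that would be passed to the ring builder when the ring is created, so it is worth sizing for the largest cluster you expect to run.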

Resources

http://www.openstack.org
https://launchpad.net/openstack
https://github.com/openstack
https://github.com/cloudbuilders
http://www.referencearchitecture.org/
http://devstack.org/
http://programmerthoughts.com/
http://www.unchainyourbrain.com
http://www.tlohg.com

Uploading to Swift

[Diagram sequence: Object Server, Object Server, Object Server]

Downloading From Swift

[Diagram: Object Server, Object Server, Object Server]